Intelligent Communication Mixture-of-Experts Boosted-Medical Image Segmentation Foundation Model
Zhang, Xinwei, Chen, Hu, Yuan, Zhe, Tian, Sukun, Feng, Peng
Foundation models for medical image segmentation have achieved remarkable performance. Adaptive fine-tuning of natural image segmentation foundation models is crucial for medical image segmentation tasks. However, existing fine-tuning methods have two limitations: 1) insufficient representation of high-level features and 2) disruption of the structural integrity of the pretrained weights during fine-tuning. Motivated by these critical problems, we propose an intelligent communication mixture-of-experts boosted medical image segmentation foundation model, named IC-MoE, with twofold ideas: 1) We construct basic experts, semantic experts, and adaptive experts. Moreover, we implement a pixel probability adaptive voting strategy, which enables expert selection and fusion through label consistency and load balancing. This approach preliminarily enhances the representation capability of high-level features while preserving the structural integrity of the pretrained weights. 2) We propose a semantic-guided contrastive learning method to address the issue of weak supervision in contrastive learning. This method further enhances the representation capability of high-level features while preserving the structural integrity of the pretrained weights. Extensive experiments across three public medical image segmentation datasets demonstrate that IC-MoE outperforms other SOTA models. Consequently, the proposed IC-MoE effectively supplements foundation medical image segmentation models with high-level features and pretrained structural integrity. We also validate the superior generalizability of IC-MoE across diverse medical image segmentation scenarios.
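A minimal sketch of a per-pixel mixture-of-experts segmentation head in the spirit of the abstract above. The three parallel expert heads and the softmax-gated per-pixel fusion are illustrative assumptions; they are not the authors' IC-MoE architecture or their pixel probability adaptive voting rule.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelMoESegHead(nn.Module):
    """Toy per-pixel mixture-of-experts head (assumed design, not IC-MoE)."""
    def __init__(self, in_ch: int, num_classes: int, num_experts: int = 3):
        super().__init__()
        # "Basic", "semantic", and "adaptive" experts are modeled here as
        # three parallel 1x1 convolutional heads (a simplifying assumption).
        self.experts = nn.ModuleList(
            [nn.Conv2d(in_ch, num_classes, kernel_size=1) for _ in range(num_experts)]
        )
        # Gating network produces per-pixel expert weights.
        self.gate = nn.Conv2d(in_ch, num_experts, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) backbone features
        logits = torch.stack([e(feats) for e in self.experts], dim=1)   # (B, E, K, H, W)
        weights = F.softmax(self.gate(feats), dim=1).unsqueeze(2)       # (B, E, 1, H, W)
        # Per-pixel fusion: probability-weighted combination of expert predictions.
        return (weights * logits).sum(dim=1)                            # (B, K, H, W)

x = torch.randn(2, 64, 32, 32)
head = PixelMoESegHead(in_ch=64, num_classes=4)
print(head(x).shape)  # torch.Size([2, 4, 32, 32])
```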
Guess-and-Learn (G&L): Measuring the Cumulative Error Cost of Cold-Start Adaptation
Evaluation of machine learning models typically emphasizes final accuracy, overlooking the cost of adaptation: the cumulative errors incurred while learning from scratch. Guess-and-Learn (G&L) v1.0 addresses this gap by measuring cold-start adaptability: the total mistakes a model makes while sequentially labeling an unlabeled dataset. At each step, the learner selects an instance, predicts its label, receives the ground truth, and updates parameters under either online (per-sample) or batch (delayed) mode. The resulting error trajectory exposes adaptation speed, selection quality, and bias, dynamics that are invisible to endpoint metrics. G&L defines four tracks (Scratch/Pretrained $\times$ Online/Batch) to disentangle the effects of initialization and update frequency. We formalize the protocol, relate it to classical mistake-bound theory, and estimate a heuristic "oracle reference band" for MNIST as a plausibility reference. Baseline experiments on MNIST and AG News, spanning classical methods (Perceptron, k-NN), convolutional architectures (CNN, ResNet-50), and pretrained transformers (ViT-B/16, BERT-base), reveal systematic differences in early-phase efficiency: smaller models can adapt with fewer initial errors, while the benefit of pretraining varies by domain. Across settings, current models remain well above the oracle band, highlighting an adaptability gap. By quantifying the mistake cost of early learning, G&L complements conventional benchmarks and provides a reproducible framework for developing learners that are not only accurate in the limit but also reliable from the first examples.
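The online track of the protocol described above amounts to a simple sequential loop: pick an unlabeled instance, guess its label, pay one error if the guess is wrong, then update on the revealed ground truth. The sketch below illustrates that loop with a toy incremental learner; the function and class names (guess_and_learn, NearestMean) are illustrative placeholders, not the official G&L v1.0 API.

```python
import numpy as np

def guess_and_learn(learner, X, y, seed=0):
    """Online-track sketch: cumulative mistakes over one pass of the data."""
    rng = np.random.default_rng(seed)
    unlabeled = list(range(len(X)))
    mistakes, trajectory = 0, []
    while unlabeled:
        # Selection step: uniform random here; a real learner may rank by confidence.
        i = unlabeled.pop(int(rng.integers(len(unlabeled))))
        y_hat = learner.predict(X[i])       # guess before seeing the label
        mistakes += int(y_hat != y[i])      # pay the error cost
        learner.update(X[i], y[i])          # online (per-sample) update
        trajectory.append(mistakes)
    return mistakes, trajectory             # final cost and cumulative-error curve

class NearestMean:
    """Toy incremental learner: per-class running mean of the features."""
    def __init__(self):
        self.means, self.counts = {}, {}
    def predict(self, x):
        if not self.means:
            return -1  # forced guess before any labels have been seen
        return min(self.means, key=lambda c: np.linalg.norm(x - self.means[c]))
    def update(self, x, c):
        n = self.counts.get(c, 0)
        self.means[c] = (self.means.get(c, np.zeros_like(x)) * n + x) / (n + 1)
        self.counts[c] = n + 1

X = np.random.default_rng(1).normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)
total, _ = guess_and_learn(NearestMean(), X, y)
print("cumulative mistakes:", total)
```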
Beyond Zero Initialization: Investigating the Impact of Non-Zero Initialization on LoRA Fine-Tuning Dynamics
Li, Shiwei, Luo, Xiandi, Tang, Xing, Wang, Haozhao, Chen, Hao, Luo, Weihong, Li, Yuhua, He, Xiuqiang, Li, Ruixuan
Low-rank adaptation (LoRA) is a widely used parameter-efficient fine-tuning method. In standard LoRA layers, one of the matrices, $A$ or $B$, is initialized to zero, ensuring that fine-tuning starts from the pretrained model. However, there is no theoretical support for this practice. In this paper, we investigate the impact of non-zero initialization on LoRA's fine-tuning dynamics from an infinite-width perspective. Our analysis reveals that, compared to zero initialization, simultaneously initializing $A$ and $B$ to non-zero values improves LoRA's robustness to suboptimal learning rates, particularly smaller ones. Further analysis indicates that although the non-zero initialization of $AB$ introduces random noise into the pretrained weight, it generally does not affect fine-tuning performance. In other words, fine-tuning does not need to strictly start from the pretrained model. The validity of our findings is confirmed through extensive experiments across various models and datasets. The code is available at https://github.com/Leopold1423/non_zero_lora-icml25.
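The contrast studied above can be made concrete with a small LoRA layer: with the standard initialization one of the two matrices is zero, so the effective weight at step 0 equals the pretrained weight, whereas initializing both $A$ and $B$ to non-zero values perturbs it by $B_0A_0$. The layer below is an illustrative sketch, not the authors' released code.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen weight W plus a low-rank update, with optional non-zero init of B."""
    def __init__(self, d_in, d_out, r=8, alpha=16.0, nonzero_init=False, std=1e-3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in), requires_grad=False)  # frozen W
        self.A = nn.Parameter(torch.randn(r, d_in) * std)
        if nonzero_init:
            self.B = nn.Parameter(torch.randn(d_out, r) * std)  # both A and B non-zero
        else:
            self.B = nn.Parameter(torch.zeros(d_out, r))        # standard LoRA: Delta W = 0 at init
        self.scale = alpha / r

    def forward(self, x):
        return x @ (self.weight + self.scale * self.B @ self.A).T

layer = LoRALinear(16, 16, nonzero_init=True)
with torch.no_grad():
    # Random "noise" added to the pretrained weight at step 0 by non-zero init.
    drift = (layer.scale * layer.B @ layer.A).norm()
print(f"initial deviation from the pretrained weight: {drift:.2e}")
```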
SRLoRA: Subspace Recomposition in Low-Rank Adaptation via Importance-Based Fusion and Reinitialization
Yang, Haodong, Wang, Lei, Hossain, Md Zakir
Low-Rank Adaptation (LoRA) is a widely adopted parameter-efficient fine-tuning (PEFT) method that injects two trainable low-rank matrices (A and B) into frozen pretrained models. While efficient, LoRA constrains updates to a fixed low-rank subspace ($\Delta W = BA$), which can limit representational capacity and hinder downstream performance. We introduce Subspace Recomposition in Low-Rank Adaptation (SRLoRA) via importance-based fusion and reinitialization, a novel approach that enhances LoRA's expressiveness without compromising its lightweight structure. SRLoRA assigns importance scores to each LoRA pair (a column of B and the corresponding row of A), and dynamically recomposes the subspace during training. Less important pairs are fused into the frozen backbone, freeing capacity to reinitialize new pairs along unused principal directions derived from the pretrained weight's singular value decomposition. This mechanism enables continual subspace refreshment and richer adaptation over time, without increasing the number of trainable parameters. We evaluate SRLoRA on both language and vision tasks, including the GLUE benchmark and various image classification datasets. SRLoRA consistently achieves faster convergence and improved accuracy over standard LoRA, demonstrating its generality, efficiency, and potential for broader PEFT applications.
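A rough sketch of the recomposition step described above: rank-one pairs (a row of A with its column of B) with low importance are folded into the frozen weight, and the freed slots are re-seeded along singular directions of the pretrained weight. The importance score used here (the product of the pair's norms) and the "unused direction" bookkeeping are simplifying assumptions, not the paper's exact criteria or schedule.

```python
import torch

def recompose(W_frozen, A, B, keep_ratio=0.5, used_dirs=0, init_scale=1e-3):
    """One SRLoRA-style fusion/reinitialization step (assumed scoring rule)."""
    r = A.shape[0]
    # Importance of each rank-one component Delta W_i = B[:, i] A[i, :].
    scores = B.norm(dim=0) * A.norm(dim=1)
    k = max(1, int(keep_ratio * r))
    low = torch.argsort(scores)[: r - k]          # least important pairs

    # Fuse the low-importance components into the frozen backbone weight.
    W_frozen = W_frozen + B[:, low] @ A[low, :]

    # Reinitialize the freed pairs along the next unused singular directions of W.
    U, S, Vh = torch.linalg.svd(W_frozen, full_matrices=False)
    new_dirs = torch.arange(used_dirs, used_dirs + len(low)) % S.shape[0]
    A, B = A.clone(), B.clone()
    A[low, :] = init_scale * Vh[new_dirs, :]
    B[:, low] = init_scale * U[:, new_dirs]
    return W_frozen, A, B, used_dirs + len(low)

W = torch.randn(32, 32)
A, B = 1e-3 * torch.randn(8, 32), 1e-3 * torch.randn(32, 8)
W, A, B, used = recompose(W, A, B)
print(W.shape, A.shape, B.shape, used)
```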
Decoupling Angles and Strength in Low-rank Adaptation
Bini, Massimo, Girrbach, Leander, Akata, Zeynep
Parameter-Efficient FineTuning (PEFT) methods have recently gained significant popularity thanks to the widespread availability of large-scale pretrained models. These methods allow for quick adaptation to downstream tasks with minimal computational cost. However, popular finetuning methods such as LoRA exhibit limited robustness when it comes to hyperparameter choices or extended training regimes, preventing optimal out-of-the-box performance. In contrast, bounded approaches, such as ETHER, provide greater robustness but are limited to extremely low-rank adaptations and fixed-strength transformations, reducing their adaptation expressive power. In this work, we propose Decoupled Low-rank Adaptation (DeLoRA), a novel finetuning method that normalizes and scales learnable low-rank matrices. Through evaluations on subject-driven image generation, natural language understanding, and instruction tuning, we show that DeLoRA matches or surpasses the performance of competing PEFT methods while exhibiting stronger robustness.

The rapid advancement of deep learning has led to the development of large-scale pretrained models in various domains, especially in computer vision and natural language processing (Touvron et al., 2023a;b; Radford et al., 2021; Rombach et al., 2022). However, the enormous size of these models, reaching billions of parameters, presents significant challenges when adapting them to specific downstream tasks, particularly in terms of computational cost and efficiency. To address these challenges, Parameter-Efficient FineTuning (PEFT) methods have emerged. PEFT methods are characterized by their introduction of a small set of learnable parameters, in contrast to the extensive parameter updates required in full finetuning. Notable examples include adapters (Houlsby et al., 2019) and prompt tuning (Lester et al., 2021).
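A hedged sketch of the decoupling idea described above: the direction ("angle") of the low-rank update comes from the learnable matrices B and A, while its magnitude ("strength") is a separate learnable scalar applied after normalization. The Frobenius-norm normalization below is a simplification for illustration, not the authors' exact DeLoRA update rule.

```python
import torch
import torch.nn as nn

class DecoupledLowRankUpdate(nn.Module):
    """Low-rank update whose norm is controlled by a separate strength parameter."""
    def __init__(self, d_in, d_out, r=8, init_strength=1.0, eps=1e-8):
        super().__init__()
        self.A = nn.Parameter(torch.randn(r, d_in) / d_in**0.5)
        self.B = nn.Parameter(torch.randn(d_out, r) / r**0.5)
        self.strength = nn.Parameter(torch.tensor(init_strength))  # bounded update scale
        self.eps = eps

    def delta_w(self):
        BA = self.B @ self.A
        # Normalize the direction, then scale it: angle and strength are decoupled.
        return self.strength * BA / (BA.norm() + self.eps)

delta = DecoupledLowRankUpdate(16, 16)
print(delta.delta_w().norm())  # approximately |strength|, independent of the scale of B and A
```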